Guardrails: Route embedder chat through guardrails and add UI toggle #234

Merged
dborovcanin merged 1 commit into main from guardrails
May 27, 2026

@FilipCivljak

Contributor

What type of PR is this?

This is a feature because it adds content guardrails to the embedder chat pipeline with a UI toggle to enable/disable them at runtime.

What does this do?

Adds a fast pre-filter endpoint, POST /guardrails/validate, to the Python guardrails service; it checks user input against safety patterns (jailbreak, prompt injection, restricted topics, toxicity, etc.) before the request is forwarded to NeMo
Adds GuardedClient, a Go wrapper around the LLM client that calls the guardrails validate endpoint before streaming chat; blocked queries receive a refusal message instead of reaching the LLM
Exposes auth-gated GET /api/v1/guardrails and PUT /api/v1/guardrails endpoints on the embedder for reading and toggling guardrails state at runtime, without a restart
Adds the EMBEDDER_GUARDRAILS_URL env var: an empty value disables guardrails entirely, and the compose stack defaults it to http://guardrails:8001
Adds a Guardrails monitoring page to the sidebar
Adds a Safety section to the Config page with a live on/off toggle bound to the new API

Which issue(s) does this PR fix/relate to?

Have you included tests for your changes?

No. The guardrails validate endpoint is a thin pass-through wrapper over pattern matching. Manual testing was performed by sending known blocked phrases (e.g. "ignore previous instructions") and verifying that a refusal is returned without the LLM being called, and by toggling the switch in the Config UI.

Did you document any new/modified features?

Notes

The validate endpoint runs only regex/substring matching (no NeMo/LLM call) so latency is <1 ms. NeMo still processes requests that pass the pre-filter, providing a second semantic layer. The toggle uses atomic.Bool so it is safe to flip at runtime under concurrent requests.

@dborovcanin merged commit d98f299 into main May 27, 2026
5 checks passed
@dborovcanin deleted the guardrails branch May 27, 2026 13:35
